460 research outputs found

    A Study on Clustering for Clustering Based Image De-Noising

    In this paper, the problem of de-noising an image contaminated with Additive White Gaussian Noise (AWGN) is studied. This subject has been an open problem in signal processing for more than 50 years. Local methods suggested in recent years have obtained better results than global methods. However, by training more intelligently, so that, first, important data has more influence on training and, second, clustering places the training blocks in low-rank subspaces, we can design a dictionary applicable to image de-noising and obtain results close to those of state-of-the-art local methods. In the present paper, we suggest a method based on global clustering of the blocks that make up the image. As the type of clustering plays an important role in clustering-based de-noising methods, we address two questions about the clustering: first, which parts of the data should be considered for clustering, and second, which data clustering method is suitable for de-noising? Clustering is then exploited to learn an overcomplete dictionary. By obtaining a sparse decomposition of the noisy image blocks in terms of the dictionary atoms, the de-noised version is achieved. In addition to our framework, 7 popular dictionary learning methods are simulated and compared. The results are compared based on two major factors: (1) de-noising performance and (2) execution time. Experimental results show that our dictionary learning framework outperforms its competitors in terms of both factors. Comment: 9 pages, 8 figures, Journal of Information Systems and Telecommunications (JIST)
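The sparse-decomposition step mentioned in this abstract can be illustrated with a minimal sketch. It assumes plain matching pursuit over a tiny hand-made dictionary; the abstract does not say which sparse coder or dictionary the authors use, so `matching_pursuit`, `reconstruct`, and the four toy atoms are hypothetical stand-ins rather than the paper's method.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def matching_pursuit(signal, atoms, n_nonzero):
    """Greedy sparse coding over unit-norm atoms.

    Returns ({atom_index: coefficient}, residual).
    """
    residual = list(signal)
    coeffs = {}
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        scores = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda k: abs(scores[k]))
        if abs(scores[best]) < 1e-12:
            break
        coeffs[best] = coeffs.get(best, 0.0) + scores[best]
        residual = [r - scores[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual


def reconstruct(coeffs, atoms, length):
    """Rebuild a block from its sparse code."""
    out = [0.0] * length
    for k, c in coeffs.items():
        out = [o + c * a for o, a in zip(out, atoms[k])]
    return out


# Toy overcomplete dictionary in R^3 (4 atoms, all unit norm); an assumption.
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0],
         normalize([1.0, 1.0, 0.0])]
clean = [2.0 * a for a in atoms[3]]
noisy = [c + n for c, n in zip(clean, [0.05, -0.04, 0.03])]
code, _ = matching_pursuit(noisy, atoms, n_nonzero=1)
denoised = reconstruct(code, atoms, len(noisy))
```

Keeping only one atom discards the part of the noise that is orthogonal to it, which is the intuition behind de-noising by sparse decomposition; a real system would work on image blocks with a learned dictionary.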

    Milton's god: democrat or tyrant?

    Politeness is a universal phenomenon that is present in every human interaction. Many theorists have attempted to theorize politeness, the most important of whom are Penelope Brown and Stephen C. Levinson. The application of their theory has been extended to literary works that are conversational in nature, such as drama, or works whose building blocks consist of dialogues. This study applies Politeness Theory to Milton's Paradise Lost in order to resolve the age-old dispute over Milton's God, to whom the contradictory characteristics of democracy and tyranny are ascribed. It will be shown that, in the conversation that takes place between God and the residents of Heaven, God is more careful about politeness strategies despite his supremacy, which seems to be at odds with the tyrannical features attributed to him.

    Fast and efficient speech enhancement with variational autoencoders

    Unsupervised speech enhancement based on variational autoencoders has shown promising performance compared with the commonly used supervised methods. This approach involves the use of a pre-trained deep speech prior along with a parametric noise model, where the noise parameters are learned from the noisy speech signal with an expectation-maximization (EM)-based method. The E-step involves an intractable latent posterior distribution. Existing algorithms to solve this step are either based on computationally heavy Markov chain Monte Carlo sampling methods and variational inference, or inefficient optimization-based methods. In this paper, we propose a new approach based on Langevin dynamics that generates multiple sequences of samples and comes with a total variation-based regularization to incorporate temporal correlations of latent vectors. Our experiments demonstrate that the developed framework makes an effective compromise between computational efficiency and enhancement quality, and outperforms existing methods.
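As a rough illustration of sampling with Langevin dynamics over several sequences, here is a minimal sketch of the unadjusted Langevin update. It assumes a 1-D Gaussian target as a stand-in for the intractable latent posterior, and it leaves out the total variation-based regularization the abstract mentions; all function names, the step size, and the chain lengths are illustrative choices, not the authors' settings.

```python
import math
import random


def grad_log_target(z, mean=2.0, var=1.0):
    """Score of a 1-D Gaussian; an assumed stand-in for the true posterior."""
    return -(z - mean) / var


def langevin_chain(n_steps, step, z0, rng):
    """Unadjusted Langevin: z <- z + step * score(z) + sqrt(2 * step) * noise."""
    z = z0
    samples = []
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        z = z + step * grad_log_target(z) + math.sqrt(2.0 * step) * noise
        samples.append(z)
    return samples


def sample_posterior(n_chains=4, n_steps=10000, burn_in=500, step=0.05, seed=0):
    """Run several independent chains and pool the samples after burn-in."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_chains):
        chain = langevin_chain(n_steps, step, z0=rng.gauss(0.0, 1.0), rng=rng)
        kept.extend(chain[burn_in:])
    return kept


samples = sample_posterior()
estimate = sum(samples) / len(samples)  # Monte Carlo estimate of the posterior mean
```

Only the gradient of the log density is needed, which is why this style of sampler suits posteriors that can be evaluated only up to a normalizing constant. The discretization introduces a small bias that shrinks with the step size.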

    Posterior sampling algorithms for unsupervised speech enhancement with recurrent variational autoencoder

    In this paper, we address the unsupervised speech enhancement problem based on recurrent variational autoencoder (RVAE). This approach offers promising generalization performance over the supervised counterpart. Nevertheless, the involved iterative variational expectation-maximization (VEM) process at test time, which relies on a variational inference method, results in high computational complexity. To tackle this issue, we present efficient sampling techniques based on Langevin dynamics and Metropolis-Hastings algorithms, adapted to the EM-based speech enhancement with RVAE. By directly sampling from the intractable posterior distribution within the EM process, we circumvent the intricacies of variational inference. We conduct a series of experiments, comparing the proposed methods with VEM and a state-of-the-art supervised speech enhancement approach based on diffusion models. The results reveal that our sampling-based algorithms significantly outperform VEM, not only in terms of computational efficiency but also in overall performance. Furthermore, when compared to the supervised baseline, our methods showcase robust generalization performance in mismatched test conditions.

    Modified Gaussian Radial Basis Function Method for the Burgers Systems

    In this paper, systems of variable-coefficient coupled Burgers equations are solved by a mesh-free method. The method is based on collocation points with the modified Gaussian (MGA) radial basis function (RBF). The dependent and independent parameters and their effect on stability are shown. The accuracy and efficiency of the method have been checked with two examples. The results of the numerical experiments are compared with analytical solutions by calculating the infinity norm of the errors.
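The building block behind RBF collocation, together with the infinity-norm error check, can be sketched as follows. This assumes the standard Gaussian RBF exp(-(c r)^2) rather than the modified Gaussian (MGA) of the paper, whose form the abstract does not give, and it interpolates a known function instead of solving the Burgers system; the shape parameter, node count, and test function are illustrative assumptions.

```python
import math


def gaussian_rbf(r, shape):
    """Standard Gaussian RBF (not the paper's modified variant)."""
    return math.exp(-(shape * r) ** 2)


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_rbf(nodes, values, shape):
    """Find weights so the RBF expansion matches `values` at the nodes."""
    matrix = [[gaussian_rbf(abs(xi - xj), shape) for xj in nodes] for xi in nodes]
    return solve_linear(matrix, values)


def evaluate_rbf(x, nodes, weights, shape):
    return sum(w * gaussian_rbf(abs(x - xj), shape) for w, xj in zip(weights, nodes))


def max_error(f, nodes, weights, shape, n_test=201):
    """Infinity norm of the error on a uniform test grid over [0, 1]."""
    pts = [i / (n_test - 1) for i in range(n_test)]
    return max(abs(f(x) - evaluate_rbf(x, nodes, weights, shape)) for x in pts)


def exact(x):
    return math.sin(math.pi * x)


nodes = [i / 8 for i in range(9)]  # 9 collocation points on [0, 1]
weights = fit_rbf(nodes, [exact(x) for x in nodes], shape=4.0)
error_inf = max_error(exact, nodes, weights, shape=4.0)
```

The shape parameter trades accuracy against conditioning of the collocation matrix: flatter basis functions tend to be more accurate but make the linear system harder to solve reliably.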

    Switching Variational Auto-Encoders for Noise-Agnostic Audio-visual Speech Enhancement

    Recently, audio-visual speech enhancement has been tackled in the unsupervised setting based on variational auto-encoders (VAEs), where during training only clean data is used to train a generative model for speech, which at test time is combined with a noise model, e.g. nonnegative matrix factorization (NMF), whose parameters are learned without supervision. Consequently, the proposed model is agnostic to the noise type. When visual data are clean, audio-visual VAE-based architectures usually outperform the audio-only counterpart. The opposite happens when the visual data are corrupted by clutter, e.g. the speaker not facing the camera. In this paper, we propose to find the optimal combination of these two architectures through time. More precisely, we introduce the use of a latent sequential variable with Markovian dependencies to switch between different VAE architectures through time in an unsupervised manner, leading to the switching variational auto-encoder (SwVAE). We propose a variational factorization to approximate the computationally intractable posterior distribution. We also derive the corresponding variational expectation-maximization algorithm to estimate the parameters of the model and enhance the speech signal. Our experiments demonstrate the promising performance of SwVAE. Comment: 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

    Eve's unEven relationship with Adam: Milton's Paradise Lost in the light of politeness theory

    Feminists, among others, have found Eve's representation in Milton's Paradise Lost problematic over the last centuries. Some of them consider Eve to be Adam’s inferior, while others find traces of an egalitarian relationship between them. This study applies Penelope Brown and Stephen C. Levinson's Politeness Theory to the conversations between Adam and Eve prior to the Fall in order to address this issue. It is demonstrated in this article that, before the Fall, Eve always exercises less power than Adam, except for a brief moment in which she achieves equality.

    Audio-visual speech enhancement with a deep Kalman filter generative model

    Deep latent variable generative models based on variational autoencoder (VAE) have shown promising performance for audiovisual speech enhancement (AVSE). The underlying idea is to learn a VAE-based audiovisual prior distribution for clean speech data, and then combine it with a statistical noise model to recover a speech signal from a noisy audio recording and video (lip images) of the target speaker. Existing generative models developed for AVSE do not take into account the sequential nature of speech data, which prevents them from fully incorporating the power of visual data. In this paper, we present an audiovisual deep Kalman filter (AV-DKF) generative model which assumes a first-order Markov chain model for the latent variables and effectively fuses audiovisual data. Moreover, we develop an efficient inference methodology to estimate speech signals at test time. We conduct a set of experiments to compare different variants of generative models for speech enhancement. The results demonstrate the superiority of the AV-DKF model compared with both its audio-only version and the non-sequential audio-only and audiovisual VAE-based models.